ABSTRACT


Schaffner's logic of comparative theory evaluation is
criticised for an inappropriate analysis of ad hocness.
An alternative analysis, based on Zahar's account of
novelty, is given and extended to the case of multiple
successful predictions by a theory. The application of
the method to the appraisal of quantitative prediction
is discussed.


The Logic of Comparative Theory Evaluation * 




1. The Bayesian Analysis of Ad hocness


In a recent note Schaffner ([1974]) has given a formal discussion
of the notion of ad hocness in terms of a Bayesian model for the
appraisal of theories. Schaffner develops his general ideas in
the context of a critique of Zahar's [1973], which was concerned
with the particular problem of comparing the Einstein and Lorentz
research programmes. Zahar suggests the following analysis of
ad hocness:
Ad hoc₁: A theory is said to be ad hoc₁ if it has no
novel consequences as compared with its predecessor.

Ad hoc₂: [A theory] ... is ad hoc₂ if none of its novel
predictions have been actually 'verified'.

Ad hoc₃: [A] ... theory is said to be ad hoc₃ if it is
obtained from its predecessor through a modification of
the auxiliary hypotheses which does not accord with the
spirit of the heuristic of the programme.


Zahar explains the meaning of novelty as follows:
A fact will be considered novel with respect to a given 
hypothesis if it did not belong to the problem-situation 
which governed the construction of the hypothesis. 


Schaffner begins his elucidation by discussing the notions of
ad hoc₁ and ad hoc₃. The first he describes as a logical dream,
since the novel consequences of a theory cannot in practice be
"surveyed", so the question of whether a theory is ad hoc₁ can
only be discussed relative to the extent to which novel
consequences have been looked for at the particular epoch of
the evaluation. Ad hoc₃ Schaffner claims to be "vague to the
point of inapplicability". For Schaffner ad hoc₂ is "close to
the sense in which ad hoc is used in science" and he announces
that his Bayesian analysis will be brought to bear on this
sense of ad hoc.

* ... of a paper read in Professor Watkins' seminar at The London School
of Economics in January 1976.


Denoting by $p(T/b \& e)$ the probability that a theory $T$ is true
in the light of background knowledge $b$ and the positive outcome
$e$ of some experiment not part of $b$, by $p(T/b)$ the prior
assessment of $T$, by $p(e/T \& b)$ the probability of obtaining $e$
given $T$ and $b$, and by $p(e/b)$ the probability of obtaining $e$ on
the basis of background knowledge alone, Schaffner writes

$$p(T/b \& e) = \frac{p(T/b)\, p(e/T \& b)}{p(e/b)} \tag{1}$$

If $T$ explains $e$ we can set $p(e/T \& b) = 1$, so in this case we
obtain

$$p(T/b \& e) = \frac{p(T/b)}{p(e/b)} \tag{2}$$

Schaffner proceeds to discuss ad hocness as a property of an
hypothesis as a constituent part of a theory, but in order
to keep the argument as simple as possible we shall follow
Zahar in considering the ad hocness of theories. Translated
into these terms Schaffner's idea is that a theory $T$ gives
an ad hoc explanation of an experimental result $e$ if $p(T/b)$
is close to zero and $p(e/b)$ is significantly larger than
$p(T/b)$. The argument for $p(T/b)$ being small is that $T$
has no "theoretical support" (or indeed empirical support
other than $e$ itself). This looks suspiciously like a
reference to ad hoc₃, and Schaffner now appears to be claiming
that a theory is ad hoc₂ partly in virtue of its being ad hoc₃.
So he is not really giving an independent analysis of ad hoc₂
at all.


The lack of novelty in the prediction of $e$ is associated by
Schaffner with a high value for $p(e/b)$. On Zahar's account
this is a necessary condition for lack of novelty, but not a
sufficient condition. We proceed to show how, using Zahar's notion
of novelty, one can get an "internal" Bayesian analysis of ad hoc₂.


We express Bayes's theorem in the following familiar way

$$p(T/b \& e) = \frac{p(T/b)\, p(e/T \& b)}{p(e/T \& b)\, p(T/b) + p(e/{\sim}T \& b)\, p({\sim}T/b)} \tag{3}$$

where ${\sim}T$ denotes the negation of $T$.


We follow Schaffner in interpreting $p(A/B)$ as the degree of
belief that $A$ is true given that $B$ is true. Note also that $e$
is supposed in equation (3) to refer to a prediction made by
the theory $T$ (we could perhaps write $e_T$ to emphasize this
important point), so $p(e/{\sim}T \& b)$ means the probability that the
prediction $e_T$ derived from the theory $T$ is a true prediction
given that the background information $b$ is true but the
theory $T$ is false (i.e., its consequences are not guaranteed to
be true although "by accident" they may be true).


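Writing $p(T/b) = x$ and $p(e/{\sim}T \& b) = \varepsilon$,
taking $p(e/T \& b) = 1$ and using $p({\sim}T/b) = 1 - x$, we obtain

$$p(T/b \& e) = \frac{x}{x + \varepsilon(1 - x)} \tag{4}$$

We define an enhancement ratio $\gamma$ by

$$\gamma = \frac{p(T/b \& e)}{p(T/b)}$$

whence using (4) we obtain the simple result

$$\gamma = \frac{1}{x + \varepsilon(1 - x)} \tag{5}$$
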


We can now explain that if a theory $T$ is ad hoc₂ with respect to
the experiment $e$ then $\varepsilon = 1$, i.e. the explanation of $e$ by $T$ in
no way depends on the truth or falsehood of $T$, both of which
eventualities lead with certainty to the result $e$. This is just
what a scientist means when he says $T$ was an ad hoc explanation of
$e$, namely $T$ was devised for the express purpose of explaining $e$,
so the explanation of $e$ is guaranteed independent of whether
$T$ is true or false. To show the consistency of our analysis,
if we put $\varepsilon = 1$ in (5) we get $\gamma = 1$, so the posterior and prior
probabilities of $T$ are equal (there is no enhancement) and this
again is just what we expect from an ad hoc explanation of $e$,
namely $e$ itself gives us no additional information for assessing
the truth of $T$.

Notice that $p(e/b) = x + \varepsilon(1 - x)$ is equal to unity if $\varepsilon = 1$,
but that $p(e/b) = 1$ is achieved for $x = 1$ whatever the value
of $\varepsilon$, so the explications of novelty in terms of low $p(e/b)$
(Schaffner) and small $\varepsilon$ (Zahar) are by no means equivalent.
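
A minimal numerical sketch of equations (4) and (5) may make these two limiting cases concrete. The function names are our own illustrative choices and appear in neither Schaffner nor Zahar; `x` stands for the prior p(T/b) and `eps` for p(e/~T & b), with p(e/T & b) taken to be 1 as in the text.

```python
def prob_e_given_b(x: float, eps: float) -> float:
    """p(e/b) = x + eps*(1 - x), assuming p(e/T & b) = 1."""
    return x + eps * (1.0 - x)


def posterior(x: float, eps: float) -> float:
    """Equation (4): p(T/b & e) = x / (x + eps*(1 - x))."""
    return x / prob_e_given_b(x, eps)


def enhancement(x: float, eps: float) -> float:
    """Equation (5): ratio of posterior to prior, 1 / (x + eps*(1 - x))."""
    return 1.0 / prob_e_given_b(x, eps)


# Case eps = 1: e is certain whether or not T is true, so the
# enhancement ratio is 1 and the posterior equals the prior.
for x in (0.01, 0.3, 0.9):
    assert abs(enhancement(x, 1.0) - 1.0) < 1e-12
    assert abs(posterior(x, 1.0) - x) < 1e-12

# Case x = 1: p(e/b) is also 1, but for any value of eps, so a high
# p(e/b) does not by itself imply eps = 1.
for eps in (0.001, 0.1, 0.5):
    assert abs(prob_e_given_b(1.0, eps) - 1.0) < 1e-12
```

The two loops check, for a handful of arbitrarily chosen values, exactly the two routes to p(e/b) = 1 distinguished in the text.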


On our analysis if $x \ll \varepsilon \ll 1$, then $\gamma \approx 1/\varepsilon$, so in this case we
get a big enhancement and the theory is far from being ad hoc.
Noting that under these conditions $p(e/b) \approx \varepsilon$, we see that this is
a situation in which Schaffner would claim that the theory was
ad hoc, which highlights the way in which his analysis differs
from ours. Effectively Schaffner requires $\gamma x$ to be small as
his condition of ad hocness. He is thus concerned with the
absolute value of the probability of a theory after it has
explained some experimental finding. If this absolute value
is still small the theory is to be regarded as an ad hoc explanation
of the experiment. On our account the important aspect in
assessing ad hocness for this case is not the absolute value of
the probability but the enhancement ratio. Our point against
Schaffner is not that his analysis may not explicate some
legitimate sense of the ambiguous appellation ad hoc, but that it
fails to explicate Zahar's very important notion of ad hoc₂. There
is no inconsistency here on Schaffner's part, since he finds
Zahar's account of ad hoc₂ inadequate in respect of the definition
of novelty involved with its historical associations,
but we would maintain that Zahar's sense of ad hoc₂ is the one that
ought to be explicated since it is the one most importantly used
in science.
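
As a worked illustration of the contrast (the numbers are chosen here purely for illustration and come from neither author), take a prior $x = 0.001$ and $\varepsilon = 0.1$, so that the prior is much smaller than $\varepsilon$, which is itself much smaller than 1:

```latex
\begin{align*}
p(e/b) &= x + \varepsilon(1 - x) = 0.001 + 0.1 \times 0.999 = 0.1009 \\
\gamma &= \frac{1}{p(e/b)} = \frac{1}{0.1009} \approx 9.91 \\
p(T/b \,\&\, e) &= \gamma x \approx 0.0099
\end{align*}
```

The enhancement is close to $1/\varepsilon = 10$, yet the posterior probability stays below 0.01. A criterion based on the absolute value of the posterior would therefore count the explanation as ad hoc, while a criterion based on the enhancement ratio would not.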


2. The Case of Multiple Predictions 


To develop our analysis a little further we can consider how a
theory builds up a favourable appraisal as it makes a number of
successful predictions $e_1, e_2, \ldots, e_n$, say. Denote by $p_s$ the
posterior probability $p(T/b \& e_1 \& e_2 \ldots \& e_s)$ after $s$ successful
predictions. Assuming the predictions are quite independent
and for simplicity are all associated with the same value of $\varepsilon$,
we can clearly write

$$p_n = \gamma^{(n)}\, \gamma^{(n-1)} \cdots \gamma^{(1)}\, p_0$$

where $p_0 = x$ according to our previous notation and

$$\gamma^{(s)} = \frac{1}{p_{s-1} + \varepsilon(1 - p_{s-1})}$$

so that

$$p_{s+1} = \gamma^{(s+1)}\, p_s$$



The solution of this recursion is found by inspection, or more simply
by replacing $\varepsilon$ by $\varepsilon^n$ in the formula for $p_1$ (see (4) above). We
obtain

$$p_n = \frac{1}{1 - \varepsilon^n + \varepsilon^n/x} \tag{6}$$

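
The closed form (6) can be checked numerically against the step-by-step updating. The sketch below assumes, as in the text, that each successful prediction multiplies the current probability p by 1/(p + eps*(1 - p)), starting from the prior x; the function names and the sample values of `x` and `eps` are our own illustrative choices.

```python
def posterior_by_recursion(x: float, eps: float, n: int) -> float:
    """Apply the single-prediction update n times, starting from p_0 = x."""
    p = x
    for _ in range(n):
        gamma = 1.0 / (p + eps * (1.0 - p))  # enhancement ratio for this step
        p = gamma * p
    return p


def posterior_closed_form(x: float, eps: float, n: int) -> float:
    """Equation (6): p_n = 1 / (1 - eps**n + eps**n / x)."""
    return 1.0 / (1.0 - eps**n + eps**n / x)


# The two computations agree (up to rounding) for several priors and
# several values of eps, including n = 0, where both reduce to x.
for x in (0.001, 0.01, 0.5):
    for eps in (0.05, 0.1, 0.7):
        for n in range(0, 12):
            rec = posterior_by_recursion(x, eps, n)
            closed = posterior_closed_form(x, eps, n)
            assert abs(rec - closed) < 1e-9
```

Agreement at n = 0 and n = 1 corresponds to the prior and to equation (4) respectively; the remaining cases confirm that substituting the nth power of eps for eps does solve the recursion.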
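We can also ask what is the probability of the $(n+1)$th
prediction $e_{n+1}$ being correct if the theory has already made $n$
successful predictions $e_1, \ldots, e_n$. Denoting this probability
by $p(e_{n+1})$, and noting that the prediction succeeds with certainty
if $T$ is true and with probability $\varepsilon$ if $T$ is false, so that
$p(e_{n+1}) = p_n + \varepsilon(1 - p_n)$, we clearly obtain the result

$$p(e_{n+1}) = \varepsilon + \frac{1 - \varepsilon}{1 - \varepsilon^n + \varepsilon^n/x} \tag{7}$$
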
If $\varepsilon$ is a small quantity (i.e. $\ll 1$), which will be the case if $T$
is non-ad hoc₂ with respect to all the predictions, we can
write the following formulae which will be perfectly satisfactory
for the subsequent discussion

$$p_n \approx \frac{1}{1 + \varepsilon^n/x} \tag{8}$$

$$p(e_{n+1}) \approx \varepsilon + p_n \tag{9}$$

$$\gamma^{(n)} \approx \frac{1}{\varepsilon + p_{n-1}} \tag{10}$$



We note the following features:

(1) The value of $p_n$ depends entirely on the ratio $\varepsilon^n/x$.
For $\varepsilon^n/x \gg 1$, $p_n \ll 1$,
and for $\varepsilon^n/x \ll 1$, $p_n \approx 1$.
At the critical point $\varepsilon^n = x$, we have $p_n \approx 1/2$.
So if initially $x < \varepsilon$, as $n$ increases $p_n$ will rise
steeply as we reach the value $n = \ln x / \ln \varepsilon$.

(2) So long as $p_n \ll \varepsilon$, we have $p(e_{n+1}) \approx \varepsilon$, but as $p_n$ builds
up towards unity, so does $p(e_{n+1})$.

(3) So long as $p_{n-1} \ll \varepsilon$, $\gamma^{(n)} \approx 1/\varepsilon$, but as $p_{n-1}$ builds
up towards unity the enhancement factor $\gamma^{(n)}$
tends to unity.


To take a concrete case we illustrate in the accompanying figure
$p_n$ and $p(e_{n+1})$ as functions of $n$ for the particular choice
$x = 0.01$, $\varepsilon = 0.1$.

[Figure: $p_n$ and $p(e_{n+1})$ as functions of $n$; vertical axis: Probability.]

3. Quantitative Predictions 


We can apply this analysis to the important case of quantitative
predictions. Suppose a theory $T$ predicts correctly an
experimental result which is known to an accuracy of $n$ significant
figures. We assume as part of our background knowledge that
the order of magnitude of the result is known, i.e. we
disregard the prediction of zeros occurring before or after the
significant figures in the experimental result. As a concrete
illustration we cite the theoretical predictions by quantum
electrodynamics of the anomaly in the hydrogen spectrum known as
the Lamb shift and the anomaly in the magnetic moment of the
electron. For example the latter is now known experimentally
to be $(11596524 \pm 2) \times 10^{-10}$ Bohr magnetons whereas the theoretical
prediction is $(11596524 \pm 6) \times 10^{-10}$ Bohr magnetons. We thus
have remarkable agreement to seven significant figures. Clearly
if we regard the prediction of each significant figure as an
independent event then the appropriate value to take for $\varepsilon$ is
0.1, since a false theory would have ten equal possibilities for
filling in each digit. The question of what value to take for
$x$ is somewhat arbitrary. In agreement with Schaffner we do not
follow a purely logical approach and set $x = 0$. For our purpose
$x$ reflects the scientist's confidence in the new theory $T$. One
could argue that a scientist would not spend great efforts
developing the consequences of a theory he did not believe in.
By analogy with the situation in the Bayesian analysis of
significance testing (see for example Redhead ([1974])) we could
take $x = 1/2$. Perhaps more realistically we should take $x$ around
0.01 and adopt the sociological rule that unless a scientist has
a one per cent level of confidence in the truth of his theory he
would not seriously investigate it. With this choice of
parameters we see that the build-up of confidence in a theory which
makes correct quantitative predictions, as the accuracy of the
experiment increases, would be illustrated by the graph for $p_n$ in
the figure. Of course $p(e_{n+1})$ is only given by our analysis
so long as other factors which could potentially influence the
results are known not to be significant. For a certain value


